AITopics | relative bias

Global Minimizers of Sigmoid Contrastive Loss

Neural Information Processing Systems | Jun-19-2026, 02:21:20 GMT

The meta-task of obtaining and aligning representations through contrastive pretraining has been steadily gaining importance since its introduction in CLIP and ALIGN. In this paper we theoretically explain the advantages of synchronizing with trainable inverse temperature and bias under the sigmoid loss, as implemented in the recent SigLIP and SigLIP2 models of Google DeepMind. Temperature and bias can drive the loss function to zero for a rich class of configurations that we call (m, b_rel)-Constellations. (m, b_rel)-Constellations are a novel combinatorial object related to spherical codes and are parametrized by a margin m and relative bias b_rel. We use our characterization of constellations to theoretically justify the success of SigLIP on retrieval, to explain the modality gap present in SigLIP, and to identify the necessary dimension for producing high-quality representations. Finally, we propose a reparameterization of the sigmoid loss with explicit relative bias, which improves training dynamics in experiments with synthetic data.

large language model, machine learning, natural language, (21 more...)

Country:

North America > United States > Massachusetts (0.28)
North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Global Minimizers of Sigmoid Contrastive Loss

Neural Information Processing Systems | Jun-13-2026, 01:06:00 GMT

artificial intelligence, machine learning, proceedings, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

ForeSWE: Forecasting Snow-Water Equivalent with an Uncertainty-Aware Attention Model

Thapa, Krishu K, Savalkar, Supriya, Singh, Bhupinderjeet, Hoang, Trong Nghia, Rajagopalan, Kirti, Kalyanaraman, Ananth

arXiv.org Artificial Intelligence | Nov-13-2025

Various complex water management decisions are made in snow-dominant watersheds with the knowledge of Snow-Water Equivalent (SWE) -- a key measure widely used to estimate the water content of a snowpack. However, forecasting SWE is challenging because SWE is influenced by various factors including topography and an array of environmental conditions, and has therefore been observed to be spatio-temporally variable. Classical approaches to SWE forecasting have not adequately utilized these spatial/temporal correlations, nor do they provide uncertainty estimates -- which can be of significant value to the decision maker. In this paper, we present ForeSWE, a new probabilistic spatio-temporal forecasting model that integrates deep learning and classical probabilistic techniques. The resulting model combines an attention mechanism, which integrates spatio-temporal features and interactions, with a Gaussian process module that provides principled quantification of prediction uncertainty. We evaluate the model on data from 512 Snow Telemetry (SNOTEL) stations in the Western US. The results show significant improvements in both forecasting accuracy and prediction interval compared to existing approaches. The results also highlight how the efficacy of uncertainty estimates differs between approaches. Collectively, these findings provide a platform for deployment and feedback by the water management community.

forecasting, machine learning, natural language, (18 more...)

arXiv:2511.08856

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)
Food & Agriculture (0.93)
Water & Waste Management > Water Management (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Global Minimizers of Sigmoid Contrastive Loss

Bangachev, Kiril, Bresler, Guy, Noman, Iliyas, Polyanskiy, Yury

arXiv.org Artificial Intelligence | Sep-24-2025

The meta-task of obtaining and aligning representations through contrastive pretraining has been steadily gaining importance since its introduction in CLIP and ALIGN. In this paper we theoretically explain the advantages of synchronizing with trainable inverse temperature and bias under the sigmoid loss, as implemented in the recent SigLIP and SigLIP2 models of Google DeepMind. Temperature and bias can drive the loss function to zero for a rich class of configurations that we call $(\mathsf{m}, \mathsf{b}_{\mathsf{rel}})$-Constellations. $(\mathsf{m}, \mathsf{b}_{\mathsf{rel}})$-Constellations are a novel combinatorial object related to spherical codes and are parametrized by a margin $\mathsf{m}$ and relative bias $\mathsf{b}_{\mathsf{rel}}$. We use our characterization of constellations to theoretically justify the success of SigLIP on retrieval, to explain the modality gap present in SigLIP, and to identify the necessary dimension for producing high-quality representations. Finally, we propose a reparameterization of the sigmoid loss with explicit relative bias, which improves training dynamics in experiments with synthetic data.
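
As a rough illustration of the loss this abstract discusses, the sketch below evaluates a SigLIP-style pairwise sigmoid loss on unit vectors, with the logit written as t * (<x_i, y_j> - b_rel). That parameterization, the function names, and the toy vectors are assumptions made for illustration; the paper's exact reparameterization may differ.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))

def sigmoid_loss(xs, ys, t, b_rel):
    """Pairwise sigmoid loss, averaged over the batch; matching pairs share an index."""
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            sim = sum(a * b for a, b in zip(x, y))   # inner product of unit vectors
            label = 1.0 if i == j else -1.0          # +1 for matching pairs, -1 otherwise
            # Assumed form of the logit: inverse temperature times (similarity - relative bias).
            total -= log_sigmoid(label * t * (sim - b_rel))
    return total / n

# Toy configuration: matching pairs have similarity 1, non-matching pairs 0.
xs = [(1.0, 0.0), (0.0, 1.0)]
ys = [(1.0, 0.0), (0.0, 1.0)]

for t in (1.0, 10.0, 100.0):
    print(t, sigmoid_loss(xs, ys, t, b_rel=0.5))
```

In this toy case the relative bias 0.5 sits strictly between the matching and non-matching similarities, so the printed loss shrinks toward zero as the inverse temperature t grows.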

large language model, machine learning, natural language, (17 more...)

arXiv:2509.18552

Country:

North America > United States > Massachusetts (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)

Relative Bias: A Comparative Framework for Quantifying Bias in LLMs

Arbabi, Alireza, Kerschbaum, Florian

arXiv.org Machine Learning | May-26-2025

The growing deployment of large language models (LLMs) has amplified concerns regarding their inherent biases, raising critical questions about their fairness, safety, and societal impact. However, quantifying LLM bias remains a fundamental challenge, complicated by the ambiguity of what "bias" entails. This challenge grows as new models emerge rapidly and gain widespread use, while introducing potential biases that have not been systematically assessed. In this paper, we propose the Relative Bias framework, a method designed to assess how an LLM's behavior deviates from other LLMs within a specified target domain. We introduce two complementary methodologies: (1) Embedding Transformation analysis, which captures relative bias patterns through sentence representations over the embedding space, and (2) LLM-as-a-Judge, which employs a language model to evaluate outputs comparatively. Applying our framework to several case studies on bias and alignment scenarios, followed by statistical tests for validation, we find strong alignment between the two scoring methods, offering a systematic, scalable, and statistically grounded approach for comparative bias analysis in LLMs.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv:2505.17131

Country:

Asia > China (0.16)
North America > United States (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Media (0.68)
Law > Civil Rights & Constitutional Law (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Generalized Theory of Mixup for Structure-Preserving Synthetic Data

Lee, Chungpa, Im, Jongho, Kim, Joseph H. T.

arXiv.org Machine Learning | Mar-3-2025

A similar approach, SMOTE (Synthetic Minority Over-sampling Technique) (Chawla et al., 2002; He et al., 2008; Bunkhumpornpat et al., 2012; Douzas et al., 2018), also leverages interpolated synthetic instances to enhance model performance, particularly for imbalanced or long-tail distributions, showcasing the effectiveness of mixup methods. In this paper we place special focus on data synthesis, an important constituent of data augmentation. While there is extensive research on how synthetic data generated by mixup can enhance model performance (Carratino et al., 2022; Zhang et al., 2021), less attention has been given to understanding the fundamental properties of the synthesized data itself; see Sec. 2.1. In fact, most mixup methods generate linearly interpolated instances by taking a weighted average where the weights are randomly drawn from distributions supported on [0, 1], such as the beta or the uniform distribution. However, this interpolation process reduces the variance, which inevitably distorts the statistical structure of the original dataset both marginally and jointly. The net effect is a less dispersed dataset that emphasizes representative instances and suppresses the others. In this regard, mixup-based synthetic datasets achieve better performance in training machine learning models by sacrificing non-representative instances, such as the tail instances, in the dataset.
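
The variance reduction described above can be checked numerically. The sketch below is a minimal simulation under stated assumptions (standard normal data, randomly chosen partners, uniform weights; the helper name `mixup` is hypothetical), not the paper's experimental setup. For two independent draws with common variance and an independent uniform weight, the interpolated point has variance E[λ² + (1 − λ)²] = 2/3 of the original.

```python
import random
import statistics

random.seed(0)

# Original data: 20,000 draws from a standard normal (variance close to 1).
data = [random.gauss(0.0, 1.0) for _ in range(20000)]

def mixup(sample, rng=random):
    """Return one synthetic point per original point via linear interpolation."""
    out = []
    for x in sample:
        partner = rng.choice(sample)      # randomly chosen second instance
        lam = rng.random()                # weight drawn from Uniform[0, 1)
        out.append(lam * x + (1.0 - lam) * partner)
    return out

synthetic = mixup(data)

var_original = statistics.pvariance(data)
var_synthetic = statistics.pvariance(synthetic)
# The ratio should land near 2/3, i.e. the synthetic data is less dispersed.
print(var_original, var_synthetic, var_synthetic / var_original)
```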

dataset, synthetic data, variance, (13 more...)

arXiv:2503.02645

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Thailand (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

SP$^2$T: Sparse Proxy Attention for Dual-stream Point Transformer

Wan, Jiaxu, Zhang, Hong, He, Ziqi, Wang, Qishu, Yuan, Ding, Yang, Yifan

arXiv.org Artificial Intelligence | Dec-16-2024

In 3D understanding, point transformers have yielded significant advances in broadening the receptive field. However, further enhancement of the receptive field is hindered by the constraints of grouping attention. Proxy-based models, a hot topic in image and language feature extraction, use global or local proxies to expand the model's receptive field. But global proxy-based methods fail to precisely determine proxy positions and are not suited for tasks like segmentation and detection in point clouds, while existing local proxy-based methods for images face difficulties in global-local balance, proxy sampling in various point clouds, and parallel cross-attention computation for sparse association. In this paper, we present SP$^2$T, a local proxy-based dual-stream point transformer, which promotes a global receptive field while maintaining a balance between local and global information. To achieve robust 3D proxy sampling, we propose a spatial-wise proxy sampling with vertex-based point proxy associations, ensuring robust sampling across point clouds of many scales. To make association computation economical, we introduce sparse proxy attention combined with table-based relative bias, which enables low-cost and precise interactions between proxy and point features. Comprehensive experiments across multiple datasets reveal that our model achieves SOTA performance in downstream tasks. The code has been released at https://github.com/TerenceWallel/Sparse-Proxy-Point-Transformer.

machine learning, natural language, proxy, (16 more...)

arXiv:2412.1154

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Differentially Private Data Release on Graphs: Inefficiencies and Unfairness

Fioretto, Ferdinando, Sen, Diptangshu, Ziani, Juba

arXiv.org Artificial Intelligence | Aug-8-2024

Networks are crucial components of many sectors, including telecommunications, healthcare, finance, energy, and transportation. The information carried in such networks often contains sensitive user data, like location data for commuters and packet data for online users. Therefore, when considering data release for networks, one must ensure that data release mechanisms do not leak information about individuals, quantified in a precise mathematical sense. Differential Privacy (DP) is the widely accepted, formal, state-of-the-art technique, which has found use in a variety of real-life settings including the 2020 U.S. Census, Apple users' device data, and Google's location data. Yet, the use of DP comes with new challenges: the noise added for privacy introduces inaccuracies or biases, and DP techniques can also distribute these biases disproportionately across different populations, inducing fairness issues. The goal of this paper is to characterize the impact of DP on bias and unfairness in the context of releasing information about networks, departing from previous work, which has studied these effects in the context of private population counts release (such as in the U.S. Census). To this end, we consider a network release problem where the network structure is known to all, but the weights on edges must be released privately. We consider the impact of this private release on a simple downstream decision-making task run by a third party, which is to find the shortest path between any pair of nodes and recommend the best route to users. This setting is of high practical relevance, mirroring scenarios in transportation networks, where preserving privacy while providing accurate routing information is crucial. Our work provides theoretical foundations and empirical evidence on the bias and unfairness arising due to privacy in these networked decision problems.
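
To make the setting concrete, the sketch below releases edge weights with added Laplace noise and then runs a shortest-path query on the noisy weights. The Laplace mechanism, the sensitivity of 1, the clamping step, and the four-edge graph are all assumptions chosen for illustration; the abstract does not specify the release mechanism.

```python
import heapq
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def shortest_path(weights, source, target):
    """Dijkstra on an undirected graph given as {(u, v): weight}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def path_cost(weights, path):
    # Look up each edge in either orientation.
    return sum(weights.get((a, b), weights.get((b, a))) for a, b in zip(path, path[1:]))

# Hypothetical road network: public topology, private edge weights.
true_w = {("A", "B"): 2.0, ("B", "D"): 2.0, ("A", "C"): 2.5, ("C", "D"): 2.0}

rng = random.Random(0)
epsilon = 1.0
# Laplace noise with scale 1/epsilon (sensitivity 1 is an assumption),
# clamped to stay positive so Dijkstra remains valid (post-processing).
noisy_w = {e: max(1e-6, w + laplace(1.0 / epsilon, rng)) for e, w in true_w.items()}

best = shortest_path(true_w, "A", "D")
recommended = shortest_path(noisy_w, "A", "D")
print("optimal:", best, path_cost(true_w, best))
print("recommended:", recommended, path_cost(true_w, recommended))
```

The gap between the true cost of the recommended route and the optimal cost is one simple way to measure the inaccuracy that the noise introduces for a given pair of nodes.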

artificial intelligence, graph, machine learning, (16 more...)

arXiv:2408.05246

Country:

North America > United States > Virginia (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.93)

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Struppek, Lukas (Technical University of Darmstadt) | Hintersdorf, Dom (Technical University of Darmstadt) | Friedrich, Felix (Technical University of Darmstadt) | Brack, Manuel (Technical University of Darmstadt) | Schramowski, Patrick (Technical University of Darmstadt) | Kersting, Kristian (Technical University of Darmstadt)

Journal of Artificial Intelligence Research | Dec-18-2023

Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in the textual description, common models reflect cultural biases in their generated images. We analyze this behavior both qualitatively and quantitatively and identify a model's text encoder as the root cause of the phenomenon. Such behavior can be interpreted as a model feature, offering users a simple way to customize the image generation and reflect their own cultural background. Yet, malicious users or service providers may also try to intentionally bias the image generation. One goal might be to create racist stereotypes by replacing Latin characters with similarly-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.

encoder, homoglyph, non-latin character, (13 more...)

doi: 10.1613/jair.1.15388

AI Access Foundation

Country:

Europe > Greece (0.14)
North America > United States (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
(37 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.91)

Testing Relative Fairness in Human Decisions With Machine Learning

Yu, Zhe, Xi, Xiaoyin

arXiv.org Artificial Intelligence | Dec-17-2023

Fairness in decision-making has been a long-standing issue in our society. Compared to algorithmic fairness, fairness in human decisions is even more important, since there are processes where humans make the final decisions and since machine learning models inherit bias from the human decisions they were trained on. However, the standards for fairness in human decisions are highly subjective and contextual. This makes it difficult to test "absolute" fairness in human decisions. To bypass this issue, this work aims to test relative fairness in human decisions. That is, instead of defining what "absolute" fair decisions are, we check the relative fairness of one decision set against another. An example outcome can be: Decision Set A favors female over male more than Decision Set B. Such relative fairness has the following benefits: (1) it avoids the ambiguous and contradictory definition of "absolute" fair decisions; (2) it reveals the relative preference and bias between different human decisions; (3) if a reference set of decisions is provided, relative fairness of other decision sets against this reference set can reflect whether those decision sets are fair by the standard of that reference set. We define relative fairness with statistical tests (null hypothesis and effect size tests) of the decision differences across each sensitive group. Furthermore, we show that a machine learning model trained on the human decisions can inherit the bias/preference and therefore can be utilized to estimate the relative fairness between two decision sets made on different data.
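
One plausible reading of "statistical tests of the decision differences across each sensitive group" is sketched below: take the per-item difference between two decision sets made on the same items, then compare those differences between two groups. The choice of Welch's t statistic and Cohen's d, the function names, and the scores are all assumptions for illustration, not the paper's exact tests or data.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def cohens_d(a, b):
    """Effect size using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled)

# Hypothetical scores given by two decision makers to the same eight items.
group = ["F", "F", "F", "F", "M", "M", "M", "M"]
set_a = [7, 8, 6, 9, 5, 6, 4, 5]
set_b = [6, 6, 5, 7, 6, 7, 5, 7]

# Per-item difference between the two decision sets, split by sensitive group.
diff = [a - b for a, b in zip(set_a, set_b)]
diff_f = [d for d, g in zip(diff, group) if g == "F"]
diff_m = [d for d, g in zip(diff, group) if g == "M"]

t_stat = welch_t(diff_f, diff_m)
effect = cohens_d(diff_f, diff_m)
print(t_stat, effect)
```

A positive statistic here would be read as "Decision Set A favors group F over group M more than Decision Set B does"; turning the statistic into a p-value requires a t-distribution, which this sketch leaves out.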

human decision, relative bias, relative fairness, (15 more...)

arXiv:2112.11279

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre:

Research Report > Experimental Study (0.95)
Research Report > New Finding (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
